# Multimodal Image Segmentation

SESAME is an open-source multimodal model based on LLaVA and fine-tuned on several instruction-based image localization (segmentation) datasets.

Internvl2 5 HiMTok 8B

HiMTok is a hierarchical mask token learning framework for image segmentation. This model applies it by fine-tuning the InternVL2_5-8B large multimodal model.

Segformer B0 Finetuned Food

An image segmentation model that can be used through the Transformers library. Judging by its name, it is a SegFormer-B0 checkpoint fine-tuned for food images.

Tags: Image Segmentation, Transformers, English
